Abstract

We study the random composition of a small family of $O(n^3)$ simple permutations on $\{0,1\}^n$.
Specifically, we ask how many randomly selected simple permutations need be composed to yield
a permutation that is close to $k$-wise independent. We improve on the results of Gowers [12] and
Hoory et al. [13] and show that, up to a polylogarithmic factor, $n^2k^2$ compositions of random
permutations from this family suffice. In addition, our results give an explicit construction of a
degree $O(n^3)$ Cayley graph of the alternating group of $2^n$ objects with a spectral gap $\Omega(2^{-n}/n^2)$,
which is a substantial improvement over previous constructions.

Keywords: Mixing-time, k-wise independent permutations, cryptography, multicommodity flow, 
reversible computation. 

A naturally occurring question in cryptography is how well the composition of simple permutations 
drawn from a simple distribution resembles a random permutation. Although such constructions are 
a common source of security for block ciphers like DES and AES, their mathematical justification 
(or lack thereof) is troubling. 

This motivated the investigation of Hoory et al. [13], who considered the notion of almost $k$-wise
independence: the distribution obtained when applying a permutation from a given
distribution to any $k$ distinct elements is almost indistinguishable from the distribution obtained
when applying a truly random permutation. The question, therefore, is how close the composition
of $T$ random simple permutations is to $k$-wise independent.

Another motivation is a fundamental open problem in the theory of expanding graphs, namely
the problem of constructing a constant degree expanding Cayley graph of the symmetric group. A
possible relaxation of this problem is to ask whether one can find a small set of simple permutations
whose action on $k$ points yields an expanding graph.

It turns out that these two problems reduce to bounding the mixing time and the spectral gap of
the random walk on the same graph. This walk, $P$, is defined on the state space of $k$-tuples of
distinct elements from the $n$-dimensional binary cube. In each step it randomly selects a simple
permutation and applies it to each of the $k$ elements at its current position. The mixing time,
$\tau(\epsilon)$, is the number of steps needed to come $\epsilon$-close to the uniform distribution (in total variation
distance), and the spectral gap, $\mathrm{gap}(P)$, is the difference between the two largest eigenvalues of $P$'s
transition matrix.

Following the construction of DES, and previous work by Gowers [12] and Hoory et al. [13], we
consider the class of width 2 simple permutations, denoted $\Sigma$. The action of such a permutation on
an element of the $n$-dimensional binary cube is to XOR a single coordinate with a Boolean function
of 2 other coordinates; there are $16n(n-1)(n-2)$ such permutations.
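
To make the definition concrete, the following minimal sketch enumerates the width 2 simple permutations for a small cube. The function names (`simple_permutation`, `all_width2`) and the choice $n = 4$ are illustrative assumptions, not part of the original text. The count refers to descriptions $(i, j_1, j_2, h)$, so distinct descriptions may give the same map (for example, every constant-zero $h$ gives the identity).

```python
from itertools import permutations, product

def simple_permutation(i, j1, j2, h):
    """Return the map x -> x with bit i XORed by h(x[j1], x[j2]).

    h is a truth table: a dict from (bit, bit) to 0 or 1.
    """
    def apply(x):
        y = list(x)
        y[i] ^= h[(x[j1], x[j2])]
        return tuple(y)
    return apply

def all_width2(n):
    """Enumerate every (i, j1, j2, h) description of a width 2 simple permutation."""
    inputs = list(product((0, 1), repeat=2))
    for i, j1, j2 in permutations(range(n), 3):
        for outputs in product((0, 1), repeat=4):
            yield i, j1, j2, dict(zip(inputs, outputs))

n = 4  # illustrative size only
cube = list(product((0, 1), repeat=n))
descriptions = list(all_width2(n))
count = len(descriptions)  # equals 16 * n * (n - 1) * (n - 2)

for i, j1, j2, h in descriptions:
    f = simple_permutation(i, j1, j2, h)
    assert sorted(f(x) for x in cube) == cube   # f is a bijection of the cube
    assert all(f(f(x)) == x for x in cube)      # f is its own inverse
```

Each such map is an involution because the coordinate being flipped is never an input of $h$.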

These problems were first considered by Gowers [12], who gave an $\tilde{O}(n^3 k(n^2+k)(n^3+k))$ bound
on the mixing time (see footnote 2) by bounding the spectral gap, $1/\mathrm{gap}(P) = O(n^2(n^2+k)(n^3+k))$.
Subsequently, Hoory et al. [13] improved the bound on the mixing time to $O(n^3k^3)$ by proving
that $1/\mathrm{gap}(P) = O(n^2k^2)$. Both results were achieved using the canonical paths technique, and
neither result applies for $k > 2^{n/2}$. Using the comparison technique, in conjunction with the theory
of reversible computation, we give better bounds for all values of $k$ up to the largest conceivable
value, $k = 2^n - 2$.

Theorem 1. $\tau(\epsilon) = \tilde{O}(n^2k^2 \cdot \log(1/\epsilon))$, as long as $k \le 2^{n/50}$.
Theorem 2. $1/\mathrm{gap}(P) = O(n^2k)$ for all $k \le 2^n - 2$.

Using the well-known connection between the mixing time and the spectral gap, Theorem 2 implies:
Corollary 3. $\tau(\epsilon) = O(n^2k \cdot (nk + \log(1/\epsilon)))$ for all $k \le 2^n - 2$.

The proofs of both theorems are based on the comparison technique for Markov chains [8]. To
prove Theorem 2 we compare the random walk $P$ either to a Glauber dynamics Markov chain or
to the random walk on the alternating group using 3-cycles. To prove Theorem 1 we observe that
after a short preamble the random walk $P$ is almost surely in a "generic" state. Consequently, it
suffices to bound the mixing time of a Markov chain restricted to "generic" states. To this end
we again employ the comparison technique, but with a better comparison constant. In all cases
we construct the multicommodity flows required by the comparison technique using ideas from the
theory of reversible computation.

It follows from [13, 17] that these results also apply in the more general setting of adaptive
adversaries (see the references for a definition).

Corollary 4. Let $T$ be the minimal number of random compositions of independent and uniformly
distributed permutations from $\Sigma$ needed to generate a permutation which is $\epsilon$-close to $k$-wise
independent against an adaptive adversary. Then $T = \tilde{O}(n^2k^2 \cdot \log(1/\epsilon))$ for $k \le 2^{n/50}$, and
$T = O(n^2k \cdot (nk + \log(1/\epsilon)))$ for $k \le 2^n - 2$.

1 Preliminaries 

Footnote 2: The notation $\tilde{O}$ suppresses a polylogarithmic factor in $n$ and $k$.

Let $f$ be a random permutation on some base set $X$. Denote by $X^{(k)}$ the set of all $k$-tuples of distinct
elements from $X$. We say that $f$ is $\epsilon$-close to $k$-wise independent if for every $(x_1,\ldots,x_k) \in X^{(k)}$
the distribution of $(f(x_1),\ldots,f(x_k))$ is $\epsilon$-close to the uniform distribution on $X^{(k)}$. We measure
the distance between two probability distributions $p, q$ by the total variation distance, defined by

$$d(p,q) = \tfrac{1}{2}\|p-q\|_1 = \tfrac{1}{2}\sum_{\omega}|p(\omega)-q(\omega)| = \max_{A}\sum_{\omega\in A}\bigl(p(\omega)-q(\omega)\bigr).$$
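
As a small illustration of the two equivalent forms of the total variation distance, the sketch below computes both for distributions given as dictionaries. The function names and the example distributions are assumptions made for this illustration; the maximum over events $A$ is attained by the set of outcomes where $p$ exceeds $q$.

```python
def tv_half_l1(p, q):
    """Total variation distance as half the L1 distance."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(u, 0.0) - q.get(u, 0.0)) for u in support)

def tv_max_event(p, q):
    """Total variation distance as p(A) - q(A) for the best event A = {u : p(u) > q(u)}."""
    support = set(p) | set(q)
    return sum(max(p.get(u, 0.0) - q.get(u, 0.0), 0.0) for u in support)

# Hypothetical example distributions with exactly representable probabilities.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.25, "c": 0.5}
d_l1 = tv_half_l1(p, q)
d_event = tv_max_event(p, q)
```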

Assume a group $H$ is acting on a set $X$ and let $S$ be a subset of $H$ closed under inversion. Then
the Schreier graph $G = sc(S,X)$ is defined by $V(G) = X$ and $E(G) = \{(x,xs) : x \in X, s \in S\}$.
For a sequence $\omega = (s_1,\ldots,s_\ell) \in S^\ell$ we denote $x\omega = xs_1 \cdots s_\ell$, and we sometimes refer by $x\omega$ to
the walk $x, xs_1, \ldots, xs_1 \cdots s_\ell$.

The random walk $X_0, X_1, \ldots$ associated with a $d$-regular graph $G$ is defined by the transition
matrix $P_{vu} = \Pr[X_{i+1} = u \mid X_i = v]$, which is $1/d$ if $(v,u) \in E(G)$ and zero otherwise. The uniform
distribution $\pi$ is stationary for this Markov chain. If $G$ is connected and not bipartite, we know
that given any initial distribution of $X_0$, the distribution of $X_t$ tends to the uniform distribution.
The mixing time of $G$ is $\tau(\epsilon) = \max_{v \in V(G)} \min\{t : d(P^{(t)}(v,\cdot), \pi) < \epsilon\}$, where $P^{(t)}(v,\cdot)$ is the
probability distribution of $X_t$ given that $X_0 = v$. It is not hard to prove (see Lemma 20 of the cited reference) that

$$\tau(2^{-\ell}) \le \ell \cdot \tau(1/4). \qquad (1)$$
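
The definition of $\tau(\epsilon)$ can be evaluated directly on a toy instance. The sketch below evolves the exact distribution of the walk for a single point ($k = 1$) under uniformly random width 2 simple permutations on the 3-dimensional cube; the size $n = 3$ and all function names are assumptions chosen so that the brute-force computation stays small, and it is not meant to reflect the regime of the theorems.

```python
from itertools import permutations, product

def step_distribution(dist, n):
    """One step of the k = 1 walk: apply a uniformly random width 2 simple permutation."""
    new = {x: 0.0 for x in dist}
    triples = list(permutations(range(n), 3))    # (i, j1, j2)
    tables = list(product((0, 1), repeat=4))     # truth tables of h
    weight = 1.0 / (len(triples) * len(tables))
    for x, px in dist.items():
        for i, j1, j2 in triples:
            for table in tables:
                y = list(x)
                y[i] ^= table[2 * x[j1] + x[j2]]
                new[tuple(y)] += px * weight
    return new

def tv_to_uniform(dist):
    u = 1.0 / len(dist)
    return 0.5 * sum(abs(p - u) for p in dist.values())

def mixing_time(n, eps, max_steps=200):
    """Smallest t such that every starting point is within eps of uniform after t steps."""
    cube = list(product((0, 1), repeat=n))
    worst = 0
    for start in cube:
        dist = {x: 0.0 for x in cube}
        dist[start] = 1.0
        t = 0
        while tv_to_uniform(dist) >= eps:
            dist = step_distribution(dist, n)
            t += 1
            if t > max_steps:
                raise RuntimeError("walk did not mix within max_steps")
        worst = max(worst, t)
    return worst

tau_quarter = mixing_time(3, 0.25)
tau_small = mixing_time(3, 0.01)
```

For $k = 1$ the value $h(x_{j_1}, x_{j_2})$ is a uniform bit, so this toy walk is just the lazy walk on the cube; larger $k$ requires tracking tuples of distinct points and is not attempted here.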

Let $1 = \beta_0 \ge \beta_1 \ge \cdots \ge \beta_{|V(G)|-1}$ be the eigenvalues of the transition matrix $P$. We say that this
random walk is lazy if for some constant $\delta > 0$ we have $P_{vv} \ge \delta$ for all $v \in V(G)$. We denote the
spectral gap $1 - \beta_1$ of the Markov chain $P$ by $\mathrm{gap}(P)$.

Two fundamental results relating the spectral gap of a Markov chain to its mixing time are the 
following: 

Theorem 5. (Proposition 3 of the cited reference) If the random walk on $G$ is lazy, then $\tau(\epsilon) = O(\log(|V(G)|/\epsilon) / \mathrm{gap}(P))$.

Theorem 6. ([19, Proposition 1.ii] or Chapter 4 of the cited reference) For any time-reversible Markov chain $P$
and $\epsilon > 0$, $\mathrm{gap}(P) = \Omega(\log(1/2\epsilon) / \tau(\epsilon))$.

2 Composing simple permutations 

Another building block that we use is a set of results on reversible computation that enable us to compose
simple permutations to construct permutations that are easier to work with. A classical result of
Coppersmith and Grossman is that for $n > 3$ the set of width 2 simple permutations generates
exactly the alternating group $A_n$. Thus, all compositions must be even permutations.
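
The parity claim is easy to check by brute force for a small cube. The sketch below is an illustrative assumption of ours (the encoding of cube points as integers and the function names are not from the original): it verifies that every width 2 simple permutation is even for $n = 4$, and that the condition $n > 3$ matters, since for $n = 3$ some width 2 simple permutations are odd.

```python
from itertools import permutations, product

def is_even(perm):
    """perm[x] is the image of x; a permutation is even iff (points - cycles) is even."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            t = start
            while not seen[t]:
                seen[t] = True
                t = perm[t]
    return (len(perm) - cycles) % 2 == 0

def width2_permutations(n):
    """Yield each width 2 simple permutation as a list over the integers 0 .. 2**n - 1."""
    for i, j1, j2 in permutations(range(n), 3):
        for table in product((0, 1), repeat=4):
            perm = []
            for x in range(2 ** n):
                b1 = (x >> j1) & 1
                b2 = (x >> j2) & 1
                perm.append(x ^ (table[2 * b1 + b2] << i))
            yield perm

all_even_n4 = all(is_even(p) for p in width2_permutations(4))
some_odd_n3 = any(not is_even(p) for p in width2_permutations(3))
```

This only checks that the generators are even; that they generate the whole alternating group is the content of the cited result and is not verified here.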

Formally, we define the set of width $w$ simple permutations, $\Sigma_w$, as the set of permutations $f_{i,J,h}$
where $i \in [n]$, $J = \{j_1,\ldots,j_w\}$ is a size $w$ ordered subset of $[n] \setminus \{i\}$, and $h$ is a Boolean
function on $\{0,1\}^w$. The permutation $f_{i,J,h}$ maps $(x_1,\ldots,x_n) \in \{0,1\}^n$ to $(x_1,\ldots,x_{i-1}, x_i \oplus
h(x_{j_1},\ldots,x_{j_w}), x_{i+1},\ldots,x_n)$. We are primarily interested in width 2 simple permutations, and
denote $\Sigma = \Sigma_2$.

Theorem 7. (Barenco et al.) The permutation that flips the $n$-th bit of input $x$ if and only if
the first $w$ bits of $x$ are 1 can be implemented as a composition of $O(w)$ permutations from $\Sigma$, as
long as $w \le n - 2$.

Theorem 8. (Brodsky) For any distinct $x,y,z \in \{0,1\}^n$, one can compose $O(n)$ permutations
from $\Sigma$ to obtain the 3-cycle $(xyz)$.

A length $\ell$ implementation of the permutation $\sigma$ is a sequence of permutations $\sigma_1,\ldots,\sigma_\ell$ from $\Sigma$
whose composition is $\sigma$. Theorem 8 gives a length $O(n)$ implementation for 3-cycles. We would like
to use this implementation to construct a multicommodity flow with low load on all edges. However,
Theorem 8 does not guarantee this. We solve this problem by randomizing the implementation,
enabling us to prove a stronger theorem.

A length $\ell$ randomized implementation of the permutation $\sigma$ is a sequence of random permutations
$\sigma_1,\ldots,\sigma_\ell$ from $\Sigma$ whose composition is $\sigma$. In Theorem 9 we give a randomized implementation for
3-cycles, such that applying any prefix $\sigma_1 \cdots \sigma_\ell$ of the randomized implementation of a uniformly
random 3-cycle $(xyz)$ to $x$ yields a string that looks random. Namely, its min-entropy $H_\infty(\cdot)$ is
high. The min-entropy is the minimum amount of information revealed when exposing the value of a random
variable $X$, that is, $H_\infty(X) = \min_x(-\log_2(\Pr[X = x]))$.
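
For concreteness, min-entropy is determined by the single most likely outcome. The short sketch below computes it for distributions given as dictionaries; the function name and the two example distributions are illustrative assumptions.

```python
import math

def min_entropy(dist):
    """H_inf(X) = min over x of -log2 Pr[X = x], i.e. -log2 of the largest probability."""
    return -math.log2(max(dist.values()))

# Hypothetical examples: a uniform distribution on 8 points and a skewed one.
uniform8 = {x: 1 / 8 for x in range(8)}
skewed = {"a": 0.5, "b": 0.25, "c": 0.25}
h_uniform = min_entropy(uniform8)
h_skewed = min_entropy(skewed)
```

A uniform distribution on $2^m$ points has min-entropy $m$, which is the sense in which a bound of the form $n - O(1)$ says that a string on $\{0,1\}^n$ "looks random".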

Theorem 9. Let $x,y,z \in \{0,1\}^n$ be uniformly distributed and distinct. Then there is a length
$L = O(n)$ randomized implementation $\sigma_1 \cdots \sigma_L$ of the 3-cycle $(xyz)$ such that for all $\ell \in [L]$ the
min-entropy of $(x\sigma_1 \cdots \sigma_{\ell-1}, \sigma_\ell)$ (which is a random variable on $\{0,1\}^n \times \Sigma$) is at least $\log_2(2^n \cdot n^3) - O(1)$.

Note that this implies that the min-entropy of the marginals is big, i.e., $H_\infty(x\sigma_1 \cdots \sigma_{\ell-1}) \ge n - O(1)$
and $H_\infty(\sigma_\ell) \ge \log_2(n^3) - O(1)$.

3 Proof of Theorem 2

In order to prove that the composition of random permutations from $\Sigma$ approaches $k$-wise independence
quickly, we construct the Schreier graph $G_{k,n} = sc(\Sigma, X^{(k)})$, where $X^{(k)}$ is the set of $k$-tuples
with $k$ distinct elements from the base set $X = \{0,1\}^n$. It is convenient to think of $X^{(k)}$ as the
set of $k$ by $n$ binary matrices with distinct rows. A simple permutation acts on $X^{(k)}$ by acting on
each of the rows. Then $P$ is the transition matrix of the random walk on $G_{k,n}$. We prove that the
random walk on this graph mixes rapidly.

To prove that $1/\mathrm{gap}(P) = O(n^2k)$, we first observe that $\mathrm{gap}(P)$ is monotone nonincreasing in $k$.
This follows from the fact that the graph $G_{k+1,n}$ is a lift of $G_{k,n}$ and therefore inherits the spectrum
of $G_{k,n}$. To see this, observe that any eigenfunction of $G_{k,n}$ can be lifted to an eigenfunction on
$G_{k+1,n}$, where the value of the latter on some $k+1$ by $n$ matrix is the value of the former on the
matrix obtained by deleting the last row. The eigenvalues of these two eigenfunctions are the same.
In light of this observation, it is sufficient to prove the following two lemmas:

Lemma 10. $1/\mathrm{gap}(P) = O(n^2 \cdot 2^n)$ for $k = 2^n - 2$.

Lemma 11. $1/\mathrm{gap}(P) = O(n^2k)$ for $k \le 2^n/3$.

We obtain the lower bound on the spectral gap of $P$ using the comparison technique [8]. This
technique enables one to lower bound $\mathrm{gap}(P)$ by $\mathrm{gap}(\tilde{P})/A$, where $\tilde{P}$ is some other Markov chain,
and $A$ is the comparison constant. In our case, all chains are walks on regular graphs. An upper
bound on $A$ is obtained by constructing a multicommodity flow on the underlying graph of $P$. The
flow routes a unit between all pairs of endpoints of edges of $\tilde{P}$ such that the flow through each edge
of $P$ is small. To prove Lemmas 10 and 11 we compare $P$ to two different Markov chains. We
start with the first lemma.

Proof (of Lemma 10). For $k = 2^n - 2$, the state space of $P$ comprises all even permutations of
$\{0,1\}^n$. Let $\tilde{P}$ be a Markov chain on this state space, where in each step we pick three distinct
elements of the cube $x,y,z \in \{0,1\}^n$ and perform the permutation $(xyz)$. It follows from a result
of Friedman that $1/\mathrm{gap}(\tilde{P}) = \Theta(2^n)$. Therefore, it is sufficient to prove that the comparison
constant of $P$ to $\tilde{P}$ is $O(n^2)$ (see footnote 3).

To bound the comparison constant $A$, we need to construct a multicommodity flow $f$ in $G_{k,n}$ that
flows a unit between every two matrices $M, M'$ such that $\tilde{P}(M,M') > 0$. Since the chains $P$ and $\tilde{P}$
correspond to random walks on regular graphs with degrees $d = \Theta(n^3)$ and $\tilde{d} = \Theta(2^{3n})$ respectively,
the formula given in [8, Theorem 2.3] reduces to:

$$A = \max_{e \in E(G_{k,n})} \sum_{\gamma \ni e} f(\gamma) \cdot |\gamma| \cdot (d/\tilde{d}). \qquad (2)$$



Let $M, M'$ be two matrices such that $\tilde{P}(M,M') > 0$. Then $M'$ can be obtained by applying some
3-cycle $(xyz)$ to $M$. Recall that the randomized implementation given by Theorem 9 induces a
probability distribution on the length $L$ sequences of permutations from $\Sigma$ whose composition is
$(xyz)$. Such a distribution naturally translates to a distribution on length $L$ paths from $M$ to $M'$.
We obtain a unit flow from $M$ to $M'$ by flowing through each such path $\gamma$ an amount equal to
its probability. We claim that the multicommodity flow obtained by repeating this process for all
pairs $M, M'$ satisfying $\tilde{P}(M,M') > 0$ yields a small comparison constant.

Since $|\gamma| \cdot (d/\tilde{d}) = O(n \cdot |\Sigma|/2^{3n})$ for all paths $\gamma$ with non-zero flow, the problem of bounding the sum
in (2) reduces to bounding the total flow through a given edge $e \in E(G_{k,n})$. Let $\gamma = (M_0,\ldots,M_L)$
be a path from $M_0$ to $M_L$, where $M_L$ is obtained from $M_0$ by applying the 3-cycle $(xyz)$. Assume
further that $\gamma$ goes through the edge $e$ at the $\ell$-th step, and that $x$ is the $r$-th row of $M_0$. For any
of the $\Theta(2^{4n} \cdot n)$ possible assignments to $x,y,z,\ell,r$, we can determine the distribution of the $r$-th
row of the matrices $M_0,\ldots,M_L$. In particular, the probability that $(M_{\ell-1}, M_\ell)$ is equal to $e$ is
bounded by the probability that they coincide in their $r$-th row. By Theorem 9, on average over all
assignments to $x,y,z,\ell,r$, this probability is $O(1/(2^n|\Sigma|))$. Putting it all together yields that, up to
a constant factor, the comparison constant $A$ is bounded by $(n \cdot |\Sigma|/2^{3n}) \cdot (2^{4n} \cdot n) \cdot (1/(2^n|\Sigma|)) = n^2$, as
claimed. □

Proof (of Lemma 11). Let $\tilde{P}$ be a Markov chain on the same state space as $P$, namely the
$k$ by $n$ binary matrices with distinct rows. If the current state of $\tilde{P}$ is the matrix $M$, then the
next state is determined by picking a row $r \in \{1,\ldots,k\}$ and setting it to a random new value
that is distinct from all other $k-1$ rows. The process $\tilde{P}$ is the Markov chain of coloring the
clique on $k$ vertices with $2^n$ colors defined in [14, Section 4.1]. Proposition 4.5 therein bounds its
mixing time by $\tilde{\tau}(\epsilon) = O(k\log(k/\epsilon))$ as long as $k \le 2^n/3$. Setting $\epsilon = 1/4k$ in Theorem 6 implies
that $\mathrm{gap}(\tilde{P}) = \Omega(1/k)$. Therefore, as in the proof of Lemma 10, it is sufficient to prove that the
comparison constant of $P$ to $\tilde{P}$ is $O(n^2)$.

Footnote 3: Alternately, one can define a transition of $\tilde{P}$ as performing two random transpositions (not necessarily disjoint)
and use a result of Diaconis and Shahshahani that $1/\mathrm{gap}(\tilde{P}) = \Theta(2^n)$.

Given matrices $M, M'$ such that $\tilde{P}(M,M') > 0$, we know that $M'$ is obtained from $M$ by changing
the value of some row $r$ from $x$ to $y$. To construct paths from $M$ to $M'$, we note that $M'$ can be
obtained by applying the 3-cycle $(xyz)$ to $M$ for any $z \in \{0,1\}^n$ that is distinct from all rows of
$M, M'$. We choose $z$ at random from the $2^n - (k+1)$ allowed values. As in the proof of Lemma 10,
the randomized implementation of $(xyz)$ given by Theorem 9 defines a distribution on paths from
$M$ to $M'$ and therefore a multicommodity flow. We turn to bound the comparison constant, given
by (2).

As before, $|\gamma| \cdot (d/\tilde{d}) = O(n \cdot |\Sigma|/(k2^n))$ for all $\gamma$ with non-zero flow, and it suffices to bound the
flow through some edge $e \in E(G_{k,n})$. We enumerate over the choices of the position $\ell$, row $r$ and
distinct $x,y$, which make a total of $\Theta(nk2^{2n})$ possible values. Again we apply Theorem 9 to argue
that, on average, the probability of agreement with $e$ is bounded by $O(1/(|\Sigma|2^n))$ (see footnote 4). Therefore, up to
a constant factor, $A = (n \cdot |\Sigma|/(k2^n)) \cdot (nk2^{2n}) \cdot (1/(|\Sigma|2^n)) = n^2$, as claimed. □

4 Proof of Theorem 1

In light of inequality (1), it is sufficient to prove that $\tau(1/4) = \tilde{O}(n^2k^2)$. The outline of the proof
is the following. We start by introducing the notion of a generic matrix; as suggested by the
name, most matrices are generic. The proof then proceeds by arguing that after a short random
walk almost surely all matrices encountered are generic. Therefore, it is sufficient to bound the
mixing time of a walk that is restricted to generic matrices. For such a walk, we can compare the
chain to a chain defined only on generic matrices and achieve a much smaller comparison constant.
This yields the desired bound, $\tilde{O}(n^2k^2)$.

Let $w = 10 \cdot (\log k + \log n)$. By assumption, we have $w \le n/4$ for a sufficiently large $n$, and we
set $p = \lceil n/2w \rceil$. Let $C_1,\ldots,C_p, C$ be a partition of $[n]$ such that $|C_i| = w$ for $i = 1,\ldots,p$ and
$|C| = n - pw$. Consequently, $n/4 \le n/2 - w < |C| \le n/2$.

We say that a $k$ by $n$ matrix is generic if for all $i \in [p]$, its restriction to $C_i$ has distinct rows. It is
not difficult to check that a uniformly distributed matrix $M$ is almost surely generic. In fact, it is
sufficient that the rows of $M$ are $2^{-w}$-close to 2-wise independent, since then the probability that
$M$ is not generic is bounded by $p$ times the probability that the restriction of $M$ to $C_i$ doesn't have
distinct rows. This yields the bound $p \cdot \binom{k}{2} \cdot (2 \cdot 2^{-w}) = o(1/n^3k^3)$ and implies the following lemma:

Lemma 12. If the rows of a random $k$ by $n$ matrix $M$ are $2^{-w}$-close to 2-wise independent, then
$M$ is generic with probability $1 - o(1/n^3k^3)$.
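
The genericity condition is straightforward to test on a given matrix. The sketch below is an illustration under stated assumptions: columns are 0-indexed, the blocks $C_1,\ldots,C_p$ are taken to be consecutive column ranges, and the tiny parameters ($n = 6$, $w = 2$) do not satisfy $w = 10 \cdot (\log k + \log n)$; they only exercise the definition.

```python
def partition(n, w):
    """Split columns 0..n-1 into p = ceil(n / 2w) blocks of width w plus the remainder C."""
    p = -(-n // (2 * w))  # ceiling division
    blocks = [list(range(i * w, (i + 1) * w)) for i in range(p)]
    rest = list(range(p * w, n))
    return blocks, rest

def is_generic(matrix, blocks):
    """A matrix is generic if its restriction to every block has pairwise distinct rows."""
    for cols in blocks:
        restricted = [tuple(row[c] for c in cols) for row in matrix]
        if len(set(restricted)) != len(restricted):
            return False
    return True

blocks, rest = partition(6, 2)
generic_example = [(0, 0, 0, 1, 0, 0), (0, 1, 1, 1, 0, 0)]
non_generic_example = [(0, 0, 0, 1, 0, 0), (0, 0, 1, 1, 0, 0)]  # rows agree on the first block
```

Note that the second example still has distinct rows as a whole; it fails only because the rows coincide on one block.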

It follows from a result of Chung and Graham about the mixing time of the "Aldous Cube"
that the number of steps needed to come close to 2-wise independence, which is the same as the
mixing time of $G_{2,n}$, is $O(n \log n)$. This is stated in the following lemma (whose proof is deferred
to a later section).

Lemma 13. For all $w > 1$ the $\epsilon$-mixing time of the Schreier graph $sc(\Sigma_w, X^{(2)})$ is $O(n \log n \log(1/\epsilon))$.

Footnote 4: One should note that $z$ is uniformly distributed only over $2^n - (k+1) \ge 2^{n-1}$ values. However, this is equivalent
to conditioning a uniform $z$ on an event with probability at least half and therefore (by an auxiliary lemma) can only increase
the probability of agreement with $e$ by a factor of two.

Therefore, the matrix obtained after $T_1 = O(n \log n \cdot w) = O(n \log n \cdot (\log k + \log n))$ steps is
$2^{-w}$-close to 2-wise independent, and by Lemma 12 it is generic with probability $1 - o(1/n^3k^3)$. This
implies that if we proceed by $T_2 = O(n^3k^3)$ steps, then all $T_2$ matrices encountered are generic with
probability $1 - o(T_2/n^3k^3) \ge 1 - \epsilon_1$, for any fixed $\epsilon_1 > 0$ and sufficiently large $n$.

We introduce a new Markov chain $P'$. The state space of $P'$ consists of all generic $k$ by $n$ matrices.
If the chain is currently at the matrix $M$, then the next state is determined as follows. We pick a
uniformly distributed simple permutation $\sigma \in \Sigma$. If $M\sigma$ is generic, we move to $M\sigma$. Otherwise,
we remain at $M$. Let $\tau'(\epsilon)$ denote the $\epsilon$-mixing time of $P'$, and require that $T_2 \ge \tau'(\epsilon_2)$ for some
fixed $\epsilon_2 > 0$.

We claim that as long as $2\epsilon_1 + \epsilon_2 < 1/4$ the mixing time of $P$ can be bounded by $\tau(1/4) \le T_1 + T_2$.
To see this, let $M$ be some $k$ by $n$ matrix with distinct rows, and consider the following two matrices.
The first matrix, $M'$, is obtained when starting at $M$ and walking $T_1 + T_2$ steps using $P$. The second
matrix, $M''$, is defined as follows. Let $\hat{M}$ be the matrix obtained when starting at $M$ and performing
$T_1$ steps of $P$. If $\hat{M}$ is not generic, we set $M'' = \hat{M}$. Otherwise, $M''$ is the matrix reached by the
length $T_2$ walk using $P'$ that starts at $\hat{M}$. We claim that $d(M', M'') \le \epsilon_1$ and that $M''$ is
$(\epsilon_1 + \epsilon_2)$-close to the uniform distribution over $k$ by $n$ matrices with distinct rows (see footnote 5). Proving those claims
will imply that

$$\tau(1/4) \le \tau'(\epsilon_2) + O(n \log n \cdot (\log k + \log n)), \qquad (3)$$
as long as $\tau'(\epsilon_2) = O(n^3k^3)$.

We start by checking that indeed $d(M', M'') \le \epsilon_1$. It is convenient to think of the two length
$T_1 + T_2$ walks from $M$ to $M'$ and $M''$ as defined over the same probability space $\Sigma^{T_1+T_2}$, which
is the choice of a simple permutation in each of the $T_1 + T_2$ steps. Denote the $P$-walk by
$(M_0 = M, M_1, \ldots, M_{T_1+T_2} = M')$. Then, if all the matrices $M_{T_1}, \ldots, M_{T_1+T_2}$ are generic, it
coincides with the walk leading to $M''$, and in particular we have $M' = M''$. By the previous
arguments, this event happens with probability at least $1 - \epsilon_1$, implying that $d(M', M'') \le \epsilon_1$.

The proof that $M''$ is $(\epsilon_1 + \epsilon_2)$-close to uniform is more delicate. We know that the matrix $\hat{M}$
is generic with probability at least $1 - \epsilon_1$. Also, since $T_2 \ge \tau'(\epsilon_2)$, we know that conditioned on
$\hat{M}$ being generic, $M''$ is $\epsilon_2$-close to the uniform distribution. Therefore $M''$ is $(\epsilon_1 + \epsilon_2)$-close to
the uniform distribution over matrices with distinct rows. This argument can be easily formalized
using an auxiliary lemma proved in a later section.

We are left with the proof of the following lemma.
Lemma 14. $\tau'(1/4) = \tilde{O}(n^2k^2)$.

To bound the mixing time of the Markov chain $P'$, we apply the comparison technique [8]. We
compare $P'$ to the Markov chain $\tilde{P}$ defined on the same state space, the $k$ by $n$ generic matrices.
Given that $\tilde{P}$ is at a matrix $M$, we determine the next state as follows. With probability half we
pick a random column $c \in C$ and row $r \in [k]$ and flip the corresponding bit with probability half.
Otherwise, we pick at random an index $i \in [p]$, a row $r \in [k]$ and a string $\alpha \in \{0,1\}^w$ that is
distinct from all other $k-1$ rows in the restriction of $M$ to the columns $C_i$. We set the bits at row
$r$ and columns $C_i$ to $\alpha$.

Footnote 5: Note that by our assumptions, the distance between the uniform distribution over matrices with distinct rows
and the uniform distribution over generic matrices is $o(1)$.

Consequently, the following two lemmas imply Lemma 14. Note that we need not worry about the
smallest eigenvalue of $P'$ since a random permutation from $\Sigma$ is the identity with probability 1/16.

Lemma 15. $\mathrm{gap}(\tilde{P}) = \Omega(1/nk)$.

Lemma 16. The comparison constant $A$ of $P'$ to $\tilde{P}$ satisfies $A = \tilde{O}(1)$.

Proof (of Lemma 15).

Consider two Markov chains $P_1$ and $P_2$:

1. The state space of $P_1$ consists of the $k$ by $w$ binary matrices with distinct rows. At each step
one chooses a random row and sets it to a random new value distinct from all other $k-1$
rows. This chain is exactly the coloring chain of a clique on $k$ vertices with $2^w$ colors of [14,
Proposition 4.5], and as in the proof of Lemma 11 it satisfies $\mathrm{gap}(P_1) = \Omega(1/k)$.

2. $P_2$ is the random walk on the $(n - wp) \cdot k$ dimensional binary cube, where in each step, with
probability half, one flips a random coordinate. Therefore, $\mathrm{gap}(P_2) = \Omega(1/nk)$.

One can think of the chain $\tilde{P}$ as the product of $p$ copies of $P_1$ and one copy of $P_2$. Indeed, the state
space of $\tilde{P}$ is the direct product of the $p+1$ state spaces. Moreover, a step of $\tilde{P}$ performs a move
of $P_2$ with probability 1/2 and otherwise performs the move in a randomly selected copy of $P_1$. It
is straightforward to check that the spectral gap of $\tilde{P}$ is $\min(\mathrm{gap}(P_1)/p, \mathrm{gap}(P_2))/2$, implying the
desired bound.

□ 

Proof (of Lemma 16).

Let $G'$ be the underlying graph of $P'$. The vertices of $G'$ are the generic $k$ by $n$ matrices, and
$(N,N')$ is an edge of $G'$ if $P'(N,N') > 0$. To bound the comparison constant $A$, we need to
construct a multicommodity flow $f$ in $G'$ that flows a unit between every two matrices $M, M'$
such that $\tilde{P}(M,M') > 0$. The chains $P'$ and $\tilde{P}$ correspond to random walks on regular graphs
with degrees $d' = \Theta(n^3)$ and $\tilde{d} = \Theta(kn2^w/w)$ respectively, and as before the comparison constant $A$ is
defined by (2).

To build a path $\gamma$ from $M$ to $M'$ we need to distinguish two types of $\tilde{P}$ transitions. Type (i) flips
the bit at row $r$ and column $c \in C$. Type (ii) changes the bits at row $r$ and columns $C_i$ from $\alpha$ to
$\alpha'$. We start by constructing the type (i) paths.

Let $j \in [p]$ be a random index, and let $\beta \in \{0,1\}^w$ be the restriction of the $r$-th row of $M$ to $C_j$.
Also let $S$ be a random sequence of $w-1$ distinct elements from $C \setminus \{c\}$. The unit flow from $M$ to
$M'$ is along paths $\gamma = \gamma^{j,S}_{M,M'}$. Each such path is defined by composing simple permutations from
$\Sigma$ to achieve the permutation that acts on $x \in \{0,1\}^n$ by flipping coordinate $c$ if the restriction
of $x$ to $C_j$ is $\beta$. Clearly such a permutation maps $M$ to $M'$. We follow the method of Barenco et
al. to build an AND gate with $w$ inputs. This gate inverts its output bit (the coordinate $c$) if
its $w$ inputs (the coordinates $C_j$) have some fixed value $\beta$. The coordinates in the set $S$ are used
as "scratch".

Let $C_j = \{j_1,\ldots,j_w\}$, $S = \{s_1,\ldots,s_{w-1}\}$ and $\beta = (b_1,\ldots,b_w)$. Let $\sigma_1$ be the simple permutation
that flips coordinate $s_1$ of $x \in \{0,1\}^n$ if $x_{j_1}$ is equal to $b_1$, and let $\sigma_\ell$ for $2 \le \ell \le w-1$ be the
simple permutation that flips coordinate $s_\ell$ if $x_{s_{\ell-1}}$ is one and $x_{j_\ell}$ is equal to $b_\ell$. Also, we denote
by $\tau_c$ the simple permutation that flips $x_c$ if $x_{s_{w-1}}$ is one and $x_{j_w}$ is equal to $b_w$. We claim that the
following permutation flips coordinate $c$ of $x \in \{0,1\}^n$ if the restriction of $x$ to $C_j$ is equal to $\beta$:

$$\sigma = (\tau_c\, \sigma_{w-1} \cdots \sigma_2\, \sigma_1\, \sigma_2 \cdots \sigma_{w-1})^2.$$

To see this, one checks by induction that $\sigma_\ell \cdots \sigma_1 \cdots \sigma_\ell$ flips coordinate $s_\ell$ if $x_{j_1},\ldots,x_{j_\ell}$ is equal to
$b_1,\ldots,b_\ell$.
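
The AND-gate construction can be checked exhaustively for small $w$. In the sketch below, the coordinate layout ($C_j$ first, then the scratch coordinates $S$, then $c$), the function names, and the choice of a cube with exactly $2w$ coordinates are assumptions made for illustration. Every gate flips one coordinate conditioned on at most two others, so each is a width 2 simple permutation, and the gates are applied from left to right.

```python
from itertools import product

def controlled_flip(target, controls):
    """Gate that flips bit `target` when x[c] == v for every (c, v) in controls."""
    def gate(x):
        if all(x[c] == v for c, v in controls):
            x[target] ^= 1
    return gate

def and_gate_sequence(cj, scratch, c, beta):
    """Gates of (tau_c sigma_{w-1} ... sigma_2 sigma_1 sigma_2 ... sigma_{w-1})^2."""
    w = len(cj)
    sigma = [controlled_flip(scratch[0], [(cj[0], beta[0])])]           # sigma_1
    for l in range(1, w - 1):                                            # sigma_2 .. sigma_{w-1}
        sigma.append(controlled_flip(scratch[l], [(scratch[l - 1], 1), (cj[l], beta[l])]))
    tau_c = controlled_flip(c, [(scratch[w - 2], 1), (cj[w - 1], beta[w - 1])])
    half = [tau_c] + sigma[:0:-1] + sigma
    return half + half

def apply_sequence(gates, x):
    y = list(x)
    for gate in gates:
        gate(y)
    return tuple(y)

def construction_is_correct(w):
    """Check, for every beta and every input, that only c changes, and only when x on C_j is beta."""
    cj = list(range(w))
    scratch = list(range(w, 2 * w - 1))
    c = 2 * w - 1
    for beta in product((0, 1), repeat=w):
        gates = and_gate_sequence(cj, scratch, c, beta)
        for x in product((0, 1), repeat=2 * w):
            expected = list(x)
            if x[:w] == beta:
                expected[c] ^= 1
            if apply_sequence(gates, x) != tuple(expected):
                return False
    return True

ok_w3 = construction_is_correct(3)
ok_w4 = construction_is_correct(4)
gate_count_w4 = len(and_gate_sequence([0, 1, 2, 3], [4, 5, 6], 7, (1, 1, 1, 1)))
```

The check also confirms that the scratch coordinates are restored regardless of their initial values, which is why the whole sequence is squared.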

For the type (ii) paths, we need to change the bits at row $r$ and columns $C_i$ from $\alpha$ to $\alpha'$. The
problem is that if we change $\alpha$ to $\alpha'$ bit by bit, as suggested by the construction of type (i) paths,
we might violate row distinctness. To solve this problem, we start our path by applying a length
$L = O(w \log w \cdot (1 + 2\log k))$ sequence $\phi$ of simple permutations with indices restricted to $C_i$.
Let $\hat{M} = M\phi$ and $\hat{M}' = M'\phi$, and let $C_i'$ and $C_i''$ be the first and last $\lfloor (w-1)/2 \rfloor$ columns of
$C_i$. We say that $\phi$ is valid if both the restriction of $\hat{M}$ to $C_i''$ and the restriction of $\hat{M}'$
to $C_i'$ have distinct rows. By Lemma 13 we know that for a random $\phi$, both $\hat{M}$ and $\hat{M}'$ are
$1/8k^2$-close to 2-wise independence. Therefore, a random $\phi$ is not valid with probability bounded by
$k^2 \cdot (2^{-w/2+1} + 1/8k^2) < 1/4$. If $\phi$ is valid we define a path $\gamma = \gamma^{j,S,\phi}_{M,M'}$ from $M$ to $M'$, where
$j \in [p] \setminus \{i\}$ and $S$ is a length $w-1$ sequence of elements from $C$. The path is prefixed by $\phi$ to get
from $M$ to $\hat{M}$ and is suffixed by $\phi^{-1}$ to get from $\hat{M}'$ to $M'$. Let $\hat{\alpha}$ and $\hat{\alpha}'$ be the restriction of the
$r$-th row of $\hat{M}$ and $\hat{M}'$ to $C_i$ respectively, and let $\beta$ be the restriction of the $r$-th row of $M$ to $C_j$.
Then the middle path connecting $\hat{M}$ to $\hat{M}'$ is defined as follows:

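$$\sigma = \Bigl[\bigl(\textstyle\prod_c \tau_c\bigr) \cdot \sigma_{w-1} \cdots \sigma_2\, \sigma_1\, \sigma_2 \cdots \sigma_{w-1}\Bigr]^2 \cdot \Bigl[\bigl(\textstyle\prod_c \tau_c\bigr) \cdot \sigma_{w-1} \cdots \sigma_2\, \sigma_1\, \sigma_2 \cdots \sigma_{w-1}\Bigr]^2,$$

where the first product ranges over the columns $c$ of $C_i$ modified in the first half of the sequence and the second over those modified in the second half,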

where $\tau_c$ and $\sigma_\ell$ are as defined for the type (i) sequences. Therefore it is guaranteed that the
matrices encountered along the first and second half of the sequence agree with $\hat{M}$ on the columns
$C_i''$ and with $\hat{M}'$ on the columns $C_i'$ respectively. Since $\phi$ is valid, this implies that we never attempt
to move to a non-generic matrix throughout the entire path. We define the unit flow from $M$ to
$M'$ by splitting the flow uniformly between all valid paths $\gamma$ designated by $S, j, \phi$.

There are two points that need special attention in the constructed type (i) and type (ii) paths. The
first point is that all indices of the simple permutations used in $\phi$ are in $C_i$. This is unacceptable for
us, as it induces an undue load on a small subset of $\Sigma$. To solve this problem we replace each simple
permutation used in $\phi$ by a constant length sequence that avoids that problem. For example, the
permutation that flips coordinate $i_1$ if $i_2$ and $i_3$ are 1, denoted $\chi_{i_1,i_2,i_3}$, is replaced by the sequence
$(\chi_{s_2,s_1,i_3}\, \chi_{s_1,i_2}\, \chi_{s_2,s_1,i_3}\, \chi_{i_1,s_2})^2$, where the permutation $\chi_{i_1,i_2}$ XORs coordinate $i_1$ with $i_2$.
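
The replacement sequence in this example can be verified exhaustively on five coordinates. The coordinate numbering and the helper names `toffoli` and `cnot` below are our own illustrative assumptions; the gates are applied from left to right.

```python
from itertools import product

# Assumed coordinate layout for the check: i1, i2, i3, then scratch s1, s2.
I1, I2, I3, S1, S2 = range(5)

def toffoli(target, a, b):
    """chi_{target,a,b}: flip `target` if coordinates a and b are both 1."""
    def gate(x):
        x[target] ^= x[a] & x[b]
    return gate

def cnot(target, a):
    """chi_{target,a}: XOR coordinate `target` with coordinate a."""
    def gate(x):
        x[target] ^= x[a]
    return gate

block = [toffoli(S2, S1, I3), cnot(S1, I2), toffoli(S2, S1, I3), cnot(I1, S2)]
sequence = block + block  # the squared sequence

def run(x):
    y = list(x)
    for gate in sequence:
        gate(y)
    return tuple(y)

# The net effect should be chi_{i1,i2,i3}, with i2, i3, s1, s2 left unchanged.
replacement_ok = all(
    run(x) == (x[I1] ^ (x[I2] & x[I3]),) + x[1:]
    for x in product((0, 1), repeat=5)
)
```

Each gate in the replacement touches at most one of the original coordinates $i_1, i_2, i_3$, which is what spreads the load away from a single block.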

The second point is that some of the simple permutations used ($\sigma_1$ and some of the permutations
in $\phi$) do not use three indices. However, in the definition of $\Sigma$, we have three indices at our disposal
even if we don't use all three. We use this to guarantee that all simple permutations used have one
index in $C_j$ and two from $S$ or $c$ for type (i) paths, or $C_i$ for type (ii) paths.






To complete the proof, we have to bound the comparison constant $A$ given by (2). We have $d'/\tilde{d} =
\Theta(n^2w/(k2^w))$ and $|\gamma| = O(L)$. Also, $f(\gamma)$ is $\Theta(w/(n\,(m)_{w-1}))$ for type (i) paths and $\Theta(w/(|\Sigma^{(w)}|^L\, n\,(m)_{w-1}))$
for type (ii) paths, where we denote $m = |C|$, $(m)_q = m(m-1)(m-2) \cdots (m-q+1)$, and $\Sigma^{(w)}$
as the width 2 simple permutations restricted to the $w$-dimensional cube. Therefore, we only have
to bound the maximal number of $\gamma^{j,S}_{M,M'}$ and $\gamma^{j,S,\phi}_{M,M'}$ paths through an edge $(N,N')$.

We start with type (i) paths. The first step is to extract as much information as possible about
a path $\gamma$ through $(N,N')$ by considering the simple permutation $s$ associated with $(N,N')$. Note
first that $s$ determines $j$. Moreover, since only one of $\sigma_1,\ldots,\sigma_{w-1}$ and $\tau_c$ can be equal to $s$, any
path $\gamma$ using $s$ must use it in one of $O(1)$ possible positions. Since a permutation $\sigma_i$ for $i \in [w-1]$
or $\tau_c$ determines two indices of $S, c$, there are only $O((m)_{w-2})$ choices for $S, c$ that are consistent
with $s$. The last thing still needed to reconstruct $\gamma$ is the string $\beta \in \{0,1\}^w$. Since the columns
$C_j$ are not modified throughout the entire sequence, $\beta$ must be the restriction of some row of $N$ to
$C_j$, limiting $\beta$ to one of $k$ possible values. Therefore, the total number of type (i) paths through
$(N,N')$ is $O(k \cdot (m)_{w-2})$, and the contribution of the type (i) sequences to $A$ is:

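$$A_{(i)} = O\Bigl(\underbrace{\frac{n^2w}{k2^w}}_{d'/\tilde{d}} \cdot \underbrace{\frac{Lw}{(m)_w}}_{f(\gamma)\cdot|\gamma|} \cdot \underbrace{k\,(m)_{w-2}}_{\text{choices for } j,S,c,\beta \text{ and position}}\Bigr) = O(Lw^2/2^w) = o(1).$$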

For type (ii) paths we distinguish the cases where $(N,N')$ is in the first, middle or last section
of a path $\gamma^{j,S,\phi}_{M,M'}$. Consider the first section (and similarly the last). We enumerate over possible
positions $\ell \in [L]$. Then we know two indices of the sequence $S$ and one of the $3L$ indices in $C_i$
that were used by $\phi$. Therefore, we have $L \cdot (m)_{w-3} \cdot |\Sigma^{(w)}|^L/w$ possible values for $S, i, \phi$ and the
position. This enables us to determine $\hat{M}$ and $M$. We still have to determine the row $r$, the two
strings $\alpha, \alpha'$ and the index $j$, which have $O(kn2^w/w)$ possibilities. Therefore the contribution of
the first and last sections of type (ii) paths is:

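$$A_{(ii,\text{first},\text{last})} = O\Bigl(\underbrace{\frac{n^2w}{k2^w}}_{d'/\tilde{d}} \cdot \underbrace{\frac{Lw}{|\Sigma^{(w)}|^L\,(m)_w}}_{f(\gamma)\cdot|\gamma|} \cdot \underbrace{\frac{k2^w L \cdot (m)_{w-2} \cdot |\Sigma^{(w)}|^L}{w^2}}_{\text{choices for } j,S,i,\phi,\alpha,\alpha',\beta \text{ and position}}\Bigr)
= O(L^2) = O(w^2 \log^2 w\,(1 + \log k)^2).$$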

For the middle section of type (ii) paths, as for the type (i) argument, given $(N,N')$ we first
determine the position up to $O(1)$ possible choices. Then we determine the index $i$ or $j$ and two
indices from $S$, so we have $O((m)_{w-2} \cdot |\Sigma^{(w)}|^L/w)$ possibilities for $i, j, S, \phi$. Also we have $k2^w$
choices for the row and the strings $\beta$, $\alpha$ and $\alpha'$. Therefore,

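$$A_{(ii,\text{middle})} = O\Bigl(\underbrace{\frac{n^2w}{k2^w}}_{d'/\tilde{d}} \cdot \underbrace{\frac{Lw}{|\Sigma^{(w)}|^L\,(m)_w}}_{f(\gamma)\cdot|\gamma|} \cdot \underbrace{\frac{k2^w \cdot (m)_{w-2} \cdot |\Sigma^{(w)}|^L}{w}}_{\text{choices for } j,S,i,\phi,\alpha,\alpha',\beta \text{ and position}}\Bigr) = O(Lw).$$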

□ 

5 Proof of Theorem 9

First, we describe the randomized implementation of a 3-cycle $(xyz)$ using the simple permutations
in $\Sigma$. Second, we show that this randomized implementation satisfies the statement of the theorem.
The randomness is introduced into the implementation of $(xyz)$ by using a permutation $\phi \in S_n$
and two vectors $v_4, v_5$.

Let $\phi$ be some permutation of the $n$ coordinates. If $\omega = \sigma_1 \cdots \sigma_L$ implements $(x\phi, y\phi, z\phi)$, then
$\omega^\phi$ is an implementation of $(xyz)$, where $\omega^\phi = \phi\omega\phi^{-1}$ is the conjugation of $\omega$ with $\phi$, i.e., the
conjugation of each of the permutations $\sigma_i$ used in $\omega$. Note that the set $S$ of simple permutations is
closed under conjugation by permutations from $S_n$, because this just relabels the indices.
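The conjugation step can be checked on a toy instance. The following is a minimal sketch, assuming the right-action convention used in this section (apply $\phi$, then $\omega$, then $\phi^{-1}$); the size `n = 3`, the coordinate permutation `phi`, and the vectors `x`, `y`, `z` are arbitrary illustrative choices, and `omega` is modeled directly as a 3-cycle rather than as a product of simple permutations.

```python
from itertools import product

n = 3
phi = (2, 0, 1)  # hypothetical coordinate permutation: the bit at index j moves to index phi[j]

def act(v, perm):
    """Relabel coordinates of v: the bit at index j moves to index perm[j]."""
    out = [0] * len(v)
    for j, bit in enumerate(v):
        out[perm[j]] = bit
    return tuple(out)

def inverse(perm):
    """Inverse of a coordinate permutation."""
    inv = [0] * len(perm)
    for j, pj in enumerate(perm):
        inv[pj] = j
    return tuple(inv)

def three_cycle(a, b, c):
    """The 3-cycle (a b c) on {0,1}^n as a lookup table; all other vectors are fixed."""
    table = {v: v for v in product((0, 1), repeat=n)}
    table[a], table[b], table[c] = b, c, a
    return table

x, y, z = (0, 0, 1), (0, 1, 1), (1, 0, 0)

# omega implements the 3-cycle on the relabeled vectors (x phi, y phi, z phi).
omega = three_cycle(act(x, phi), act(y, phi), act(z, phi))

# omega^phi = phi omega phi^{-1} under the right action.
conjugated = {
    v: act(omega[act(v, phi)], inverse(phi))
    for v in product((0, 1), repeat=n)
}

assert conjugated == three_cycle(x, y, z)
```

The assertion confirms that relabeling, applying the implementation for the relabeled vectors, and undoing the relabeling yields exactly $(xyz)$ in this example.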

For a vector $v \in \{0,1\}^n$, we denote the first $n-2$ bits of $v$ by $v' \in \{0,1\}^{n-2}$ and the last two bits
of $v$ by $v'' \in \{0,1\}^2$, i.e., $v = v'v''$. We call the last two bits the control bits. For convenience, the
notation $v'00$, $v'01$, $v'10$, and $v'11$ denotes bit vectors comprising the first $n-2$ bits of $v$ and the
control bits 00, 01, 10, and 11, respectively. Let $(v)_j$ denote the $j$-th bit of a vector $v$. Finally, let

$v_1 = x\phi$, $v_2 = y\phi$ and $v_3 = z\phi$.

If $v'_1$ is equal to $v'_2$ or to $v'_3$ then we say that $\phi$ is invalid. This can only occur if $x$, $y$, or $z$ are
less than Hamming distance 3 apart and $\phi$ maps all indices on which $x$ and $y$ (or $z$) differ to the
control indices. For the rest of the description we assume that $\phi$ is valid. Let $v_4, v_5 \in \{0,1\}^n$ be two
additional vectors satisfying the validity requirement of being at least Hamming distance 3 from
each other and from the former three vectors.

Observe that $(v_1, v_2, v_3) = \psi_1\psi_2$ where $\psi_1 = (v_1, v_2)(v_4, v_5)$ and $\psi_2 = (v_1, v_3)(v_4, v_5)$. Therefore
it suffices to implement the two double transpositions $\psi_1$ and $\psi_2$. These are implemented in an
identical manner. Each implementation is divided into 15 blocks: a core block, which implements
the permutation $\rho_{\mathrm{core}} = (v'_5 00, v'_5 01)(v'_5 10, v'_5 11)$, and seven block pairs conjugating it.

The first four of these blocks, called $\pi$-blocks, ensure that the control bits of each of the four vectors
are distinct. Specifically, $v'_i v''_i$ is mapped to $v'_i c_i$, where $c_1 = 00$, $c_2 = c_3 = 01$, $c_4 = 10$ and $c_5 = 11$.
If $v''_i = c_i$ then the corresponding block, labeled $\pi_i$, performs a nop. Otherwise, block $\pi_i$ performs
the permutation $(v'_i v''_i, v'_i c_i)(v'_i a_i, v'_i b_i)$ where $\{a_i, b_i\} = \{0,1\}^2 \setminus \{v''_i, c_i\}$.

The remaining three blocks, called $\tau$-blocks, map $v'_1$, $v'_2$ (or $v'_3$), and $v'_4$ to $v'_5$, using the control bits to
distinguish between the four vectors. Block $\tau_i$ performs the permutation $\tau_i = \prod_{v' \in \{0,1\}^{n-2}} (v' c_i,\, u' c_i)$,
where $u' = v' \oplus v'_i \oplus v'_5$. Since it can easily be checked that $\tau_i = \tau_i^{-1}$, that $\pi_i = \pi_i^{-1}$, and that
$\pi_1\pi_2\pi_4\pi_5\tau_1\tau_2\tau_4\,\rho_{\mathrm{core}}\,\tau_4\tau_2\tau_1\pi_5\pi_4\pi_2\pi_1 = \psi_1$ and $\pi_1\pi_3\pi_4\pi_5\tau_1\tau_3\tau_4\,\rho_{\mathrm{core}}\,\tau_4\tau_3\tau_1\pi_5\pi_4\pi_3\pi_1 = \psi_2$,
we need only describe the implementation of each of these blocks.
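The two block identities above can be verified mechanically on a small instance. The sketch below is illustrative only: it treats each block as a whole permutation of $\{0,1\}^n$ (not as a product of simple permutations), encodes a vector as an integer whose two low bits are the control bits, and uses a toy size `N_BITS = 5` with hand-picked vectors. It assumes only that the $v'_i$ are pairwise distinct, which is weaker than the Hamming-distance requirement in the text; all function names are hypothetical.

```python
N_BITS = 5                                   # toy size (assumption)
SIZE = 1 << N_BITS
C = {1: 0b00, 2: 0b01, 3: 0b01, 4: 0b10, 5: 0b11}   # control bits c_i from the text

def compose(*perms):
    """Right action: apply perms[0] first, then perms[1], and so on."""
    out = list(range(SIZE))
    for p in perms:
        out = [p[x] for x in out]
    return out

def swaps(*pairs):
    """Product of disjoint transpositions given as (a, b) pairs."""
    p = list(range(SIZE))
    for a, b in pairs:
        p[a], p[b] = b, a
    return p

def pi_block(v_i, c_i):
    """Move the control bits of v_i to c_i; only vectors with v' = v'_i are touched."""
    hi, ctrl = (v_i >> 2) << 2, v_i & 3
    if ctrl == c_i:
        return list(range(SIZE))             # nop
    a, b = sorted({0, 1, 2, 3} - {ctrl, c_i})
    return swaps((hi | ctrl, hi | c_i), (hi | a, hi | b))

def tau_block(v_i, v_5, c_i):
    """On vectors with control bits c_i, XOR v' with v'_i xor v'_5."""
    mask = ((v_i >> 2) ^ (v_5 >> 2)) << 2
    return [x ^ mask if (x & 3) == c_i else x for x in range(SIZE)]

def rho_core(v_5):
    """Flip the second control bit exactly on vectors with v' = v'_5."""
    hi = (v_5 >> 2) << 2
    return swaps((hi | 0b00, hi | 0b01), (hi | 0b10, hi | 0b11))

def double_transposition(v, i):
    """Conjugate rho_core by pi_1 pi_i pi_4 pi_5 tau_1 tau_i tau_4 (all involutions)."""
    pre = [pi_block(v[j], C[j]) for j in (1, i, 4, 5)]
    pre += [tau_block(v[j], v[5], C[j]) for j in (1, i, 4)]
    return compose(*pre, rho_core(v[5]), *reversed(pre))

# Hand-picked vectors with pairwise distinct v' parts (upper three bits).
v = {1: 0b00010, 2: 0b00111, 3: 0b01000, 4: 0b01101, 5: 0b10011}

psi_1 = double_transposition(v, 2)
psi_2 = double_transposition(v, 3)
assert psi_1 == swaps((v[1], v[2]), (v[4], v[5]))
assert psi_2 == swaps((v[1], v[3]), (v[4], v[5]))

# psi_1 followed by psi_2 is the 3-cycle (v_1, v_2, v_3).
cycle = compose(psi_1, psi_2)
assert (cycle[v[1]], cycle[v[2]], cycle[v[3]]) == (v[2], v[3], v[1])
```

Because every block is an involution, reversing the prefix gives its inverse, so the 15-block product is a conjugate of $\rho_{\mathrm{core}}$; the assertions check that this conjugate is the intended double transposition.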

Each of the blocks is implemented using $O(n)$ simple permutations. Each $\tau$-block is implemented
by concatenating $n-2$ simple permutations, where for $j = 1, \ldots, n-2$, the $j$-th simple permutation
is the identity if $(v'_i)_j = (v'_5)_j$, and otherwise flips the $j$-th bit of vector $v$ if $v'' = c_i$.
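As a rough illustration of this step-by-step description, the sketch below builds a $\tau$-block as a list of $n-2$ conditional bit flips, each depending only on the two control bits, and compares the result with a single XOR mask. The encoding (control bits in the two low bits of an integer, data bit $j$ counted from the low end of $v'$), the size `N_BITS = 6`, and the sample values are assumptions made for the example. The final loop also checks that after $j$ steps the data part consists of $j$ bits of $v'_5$ followed by the remaining bits of $v'_i$, which is the fact used in the second case of the proof of Lemma 17.

```python
N_BITS = 6                       # toy size (assumption): 4 data bits + 2 control bits

def tau_steps(vi_hi, v5_hi, c):
    """Return the n-2 steps of a tau-block as functions on n-bit integers.

    Step j is the identity if bit j of v'_i and v'_5 agree; otherwise it flips
    data bit j of v whenever the control bits of v equal c.
    """
    steps = []
    for j in range(N_BITS - 2):
        if ((vi_hi >> j) & 1) == ((v5_hi >> j) & 1):
            steps.append(lambda v: v)
        else:
            bit = 1 << (j + 2)   # data bit j sits above the two control bits
            steps.append(lambda v, bit=bit: v ^ bit if (v & 3) == c else v)
    return steps

def run(v, steps):
    """Apply the given steps to v from left to right."""
    for step in steps:
        v = step(v)
    return v

vi_hi, v5_hi, c = 0b0110, 0b1010, 0b01      # sample v'_i, v'_5 and control bits c_i
steps = tau_steps(vi_hi, v5_hi, c)

# The whole block XORs v' with v'_i xor v'_5 on vectors whose control bits are c.
mask = (vi_hi ^ v5_hi) << 2
for v in range(1 << N_BITS):
    assert run(v, steps) == (v ^ mask if (v & 3) == c else v)

# Starting from v'_i c, after j steps the low j data bits come from v'_5
# and the remaining data bits still come from v'_i.
start = (vi_hi << 2) | c
for j in range(len(steps) + 1):
    low = (1 << j) - 1
    assert run(start, steps[:j]) >> 2 == (v5_hi & low) | (vi_hi & ~low)
```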

The implementation of the $\rho_{\mathrm{core}}$ and $\pi$ blocks is more involved. Permutation $\rho_{\mathrm{core}}$ flips bit $(v'')_2$ if
and only if $v' = v'_5$. Barenco et al. [3] showed how such permutations can be implemented using $O(n)$
simple permutations, comprising four sub-blocks: $\rho_{\mathrm{top}}\rho_{\mathrm{bot}}\rho_{\mathrm{top}}\rho_{\mathrm{bot}}$, where permutation $\rho_{\mathrm{top}}$ flips bit
$(v'')_1$ if the first $\lceil (n-2)/2 \rceil$ bits of $v'$ match the first $\lceil (n-2)/2 \rceil$ bits of $v'_5$, and where permutation
$\rho_{\mathrm{bot}}$ flips bit $(v'')_2$ if the latter $\lfloor (n-2)/2 \rfloor$ bits of $v'$ match the latter $\lfloor (n-2)/2 \rfloor$ bits of $v'_5$ and
$(v'')_1 = 1$. Each sub-block uses the other half of the bits of $v'$ as "scratch", returning them to their
original state by the end of the sub-block. For details about the construction of the two sub-blocks
see jH] or Lemma [TBI 
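The net effect of the four sub-blocks can be checked exhaustively at the logical level. The sketch below is only a model: it treats $\rho_{\mathrm{top}}$ and $\rho_{\mathrm{bot}}$ as black boxes with the stated input/output behavior and ignores how each is built from simple permutations with scratch bits. The size `M = 4` and the `target` pattern standing in for $v'_5$ are arbitrary choices.

```python
from itertools import product

M = 4                      # number of data bits n - 2 (toy size, assumption)
TOP = (M + 1) // 2         # ceil(M / 2) leading bits are checked by rho_top

def rho_top(data, c1, c2, target):
    """Flip control bit 1 if the first TOP data bits match the target."""
    if data[:TOP] == target[:TOP]:
        c1 ^= 1
    return data, c1, c2

def rho_bot(data, c1, c2, target):
    """Flip control bit 2 if the remaining data bits match and control bit 1 is set."""
    if data[TOP:] == target[TOP:] and c1 == 1:
        c2 ^= 1
    return data, c1, c2

def rho_core(data, c1, c2, target):
    """Apply the four sub-blocks rho_top, rho_bot, rho_top, rho_bot in order."""
    state = (data, c1, c2)
    for block in (rho_top, rho_bot, rho_top, rho_bot):
        state = block(*state, target)
    return state

target = (1, 0, 1, 1)      # stands in for v'_5
for data in product((0, 1), repeat=M):
    for c1, c2 in product((0, 1), repeat=2):
        # Net effect: control bit 2 flips iff all data bits match; bit 1 is restored.
        expected = (data, c1, c2 ^ (data == target))
        assert rho_core(data, c1, c2, target) == expected
```

The second $\rho_{\mathrm{top}}$ undoes the change to $(v'')_1$, and the two $\rho_{\mathrm{bot}}$ applications cancel unless the first half matched, which is why the product flips $(v'')_2$ exactly when $v' = v'_5$.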

Each block $\pi_i$ is implemented in a similar manner using two permutations that are nearly identical
to the implementation of $\rho_{\mathrm{core}}$. The first (second) permutation performs the identity if $(v''_i)_1 = (c_i)_1$
(respectively, $(v''_i)_2 = (c_i)_2$) and otherwise flips bit $(v'')_1$ (respectively, $(v'')_2$) if $v' = v'_i$.

The length of the implementations of $\psi_1$ and $\psi_2$ is $O(n)$, since each of the blocks can be
implemented using $O(n)$ simple permutations from $S$. The randomized implementation of $(xyz)$ is
obtained by uniformly choosing at random a valid permutation $\phi$ and the two valid random vectors
$v_4$ and $v_5$.
We now prove that this randomized implementation satisfies the statement of the theorem. Let
$\Omega = \{(x, y, z, v_4, v_5, \phi)\}$ be the probability space obtained by uniformly choosing three distinct vectors
$x$, $y$, and $z$, and then uniformly choosing a corresponding implementation, which is fixed by $v_4$,
$v_5$, and $\phi$. Each point $\omega = (x, y, z, v_4, v_5, \phi) \in \Omega$ corresponds to an implementation $\sigma_1 \cdots \sigma_L$ of the
3-cycle $(xyz)$. The size of $\Omega$ is $\Theta(2^{5n} n!)$, and although not uniform, the probability of each point
in $\Omega$ is $\Theta(1/(2^{5n} n!))$. Thus, our problem of upper-bounding $\Pr[x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \bar{x},\ \sigma_\ell = \sigma]$ reduces
to a counting problem.

For all implementations $\omega \in \Omega$, the indices of the $\ell$-th permutation $\sigma_\ell$ depend only on its position,
$\ell$, and $\phi$. Moreover, as we change $\phi$ the indices of the $\ell$-th permutation of the implementation of
$(x, y, z, v_4, v_5, \phi)$ agree with the indices of some fixed permutation $\sigma$ only on a subset of $S_n$ that is
of size $O(n!/n^3)$ and depends only on $\ell$ and $\sigma$.

To establish the theorem we need to prove that for any given $\phi$ the number of choices of $x$, $y$, $z$,
$v_4$, and $v_5$, such that $x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \bar{x}$, is $O(2^{4n})$, implying the number of points in $\Omega$ that agree
with $\bar{x}$ and $\sigma$ is $O(2^{4n} n!/n^3) = O(|\Omega|/(2^n n^3))$. This is accomplished by the following lemma:

Lemma 17. Let $\phi \in S_n$ be fixed. Then the set of all $x, y, z, v_4, v_5$ such that the implementation
corresponding to $(x, y, z, v_4, v_5, \phi)$ satisfies the equality $x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \bar{x}$ is of size $O(2^{4n})$.

Proof (of Lemma 17).

Let $v_1 = x\phi$, $v_2 = y\phi$, $v_3 = z\phi$, and $v = \bar{x}\phi$. Let $\Omega_{\ell,v}$ be the set of tuples $(v_1, \ldots, v_5)$ for which
$x\sigma_1\sigma_2 \cdots \sigma_{\ell-1} = \bar{x}$ is satisfied. Note that this set is independent of $\phi$. Then the claim is that
$|\Omega_{\ell,v}| = O(2^{4n})$.

The proof is via case analysis with respect to position $\ell$. Without loss of generality we assume that
the position is in the first half of the implementation, that which realizes permutation $(v_1, v_2)(v_4, v_5)$;
otherwise, swapping $v_2$ and $v_3$ allows the same argument to be reused for the latter half of the
implementation. Furthermore, due to symmetry, we assume that the position $\ell$ is in or to the
left of block $\rho_{\mathrm{core}}$. There are four main cases: either $\ell$ is on a boundary between two blocks, $\ell$ is in
block $\tau_i$, $\ell$ is in the block $\rho_{\mathrm{core}}$, or $\ell$ is in block $\pi_i$.

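Figure 1: The evolution of $v_1$: $v_1 \xrightarrow{\pi_1} v'_1 c_1 \xrightarrow{\pi_2\pi_4\pi_5} v'_1 c_1 \xrightarrow{\tau_1} v'_5 c_1 \xrightarrow{\tau_2\tau_4} v'_5 c_1 \xrightarrow{\rho_{\mathrm{core}}} v'_5 c_2 \cdots$. Up to the end of the $\pi$-blocks, $v'$ fixes $v'_1$; after $\tau_1$, $v'$ fixes $v'_5$.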

In the first case, the position $\ell$ is on a block boundary. Since each $\pi$-block only toggles bits $(v'')_1$
and $(v'')_2$, if position $\ell$ is adjacent to a $\pi$-block, then $v' = v'_1$. Thus, all but two bits of $v_1$ are fixed
by $v$. If position $\ell$ is on a boundary but is not adjacent to a $\pi$-block, then it must occur after
block $\tau_1$. Since block $\tau_1$ maps $v'_1 00$ to $v'_5 00$, and none of the remaining blocks, $\tau_i$ or $\rho_{\mathrm{core}}$, change
the $v'$ component to any other value, we have $v' = v'_5$. Thus, all but two bits of $v_5$ are fixed by $v$,
implying that the set $\Omega_{\ell,v}$ of admissible tuples satisfies $|\Omega_{\ell,v}| = O(2^{4n})$.

In the second case, the position $\ell$ is inside block $\tau_i$. If $i \neq 1$, then none of the simple permutations in
block $\tau_i$ flips a bit of the vector being tracked, since its control bits are $c_1 \neq c_i$. Therefore $v' = v'_5$, which
fixes all but two bits of $v_5$, as before. If $i = 1$, then at position $\ell$ we know exactly how many of the
$n-2$ simple permutations have already been performed. Let $j$ be this number. Hence we know that
$v' = (v'_5)_1, \ldots, (v'_5)_j, (v'_1)_{j+1}, \ldots, (v'_1)_{n-2}$.
Therefore, $j$ bits of $v_5$ and $n-2-j$ bits of $v_1$ are fixed by $v$, implying that the set $\Omega_{\ell,v}$ of admissible
tuples satisfies $|\Omega_{\ell,v}| = O(2^{4n})$ as well.

In the third case, the position $\ell$ is inside block $\rho_{\mathrm{core}}$. In this case we must look at the sub-blocks
of the block $\rho_{\mathrm{core}}$. If the position occurs on a sub-block boundary, then since each of the sub-blocks
simply toggles the bits $(v'')_1$ and $(v'')_2$, the remaining bits of $v'_5$ are fixed by $v'$. If the position $\ell$ is
inside a sub-block, then things are only slightly more complicated. Assume that position $\ell$ is in a
$\rho_{\mathrm{top}}$ sub-block (similar arguments hold for $\rho_{\mathrm{bot}}$). Then, $\rho_{\mathrm{top}}$ toggles bit $(v'')_1$ if the first half of $v'$
matches the first half of $v'_5$. The bits being matched are never modified and the other half of the
bits of $v'$ are used as "scratch". We know that the first half of $v'$ and $v'_5$ coincide throughout the
block $\rho_{\mathrm{top}}$, and therefore $v'$ determines this half of $v'_5$. The operations on the "scratch" half depend
only on the fixed half and the position, and therefore can be reversed, reducing the problem to the
position occurring at the beginning of $\rho_{\mathrm{top}}$. Thus, $v$ fixes all but two of the bits of $v_5$.

In the last case, the position $\ell$ is inside a $\pi$-block. Block $\pi_i$ comprises two blocks that are similar
to $\rho_{\mathrm{core}}$. Each of the two blocks is either the identity or toggles $(v'')_1$ or $(v'')_2$ if $v' = v'_i$. If $i = 1$
then the two blocks in block $\pi_i$ behave in the same manner as block $\rho_{\mathrm{core}}$, except that $v$ fixes all
but two of the bits of $v_1$ rather than $v_5$. If $i \neq 1$, then, for the most part, the argument remains the
same. We need only consider what happens if the position $\ell$ is in one of the eight sub-blocks. As
mentioned before, half of the bits of $v'$ are not modified by the sub-block, while the other half are
used as "scratch". Again, without loss of generality, we assume that the sub-block does not modify
the first half of $v'$. As before, $v'$ fixes the first half of $v'_1$. We enumerate on all choices for the first
half of $v'_i$. This enables us to reverse the operations of the sub-block on the "scratch", fixing the
second half of $v'_1$, as in the third case. This implies that $|\Omega_{\ell,v}| = O(2^{4n})$, and completes the proof.

□ 

6 Odds and Ends 

Proof (of Lemma 13).

We have to prove that for all $w > 1$ the mixing time of $G^{(2)}_w = \mathrm{sc}(S_w, X^{(2)})$ is $O(n \log n)$.

Given a 2 by $n$ matrix with rows $s, t$, we change basis to $s, u$ with $u = s \oplus t$. Let $i \in [n]$ be a
random coordinate, and consider the action of a width $w$ permutation XORing the $i$-th bit with a
random function $h$ on $w$ distinct coordinates from $[n] \setminus \{i\}$. We claim that its action on $s, u$ is the
same as XORing the $i$-th bit of $s$ and $u$ with two independent random bits $a_s$ and $a_u$ respectively.
The bits $a_s, a_u$ are one with probability $1/2$ and $p_\ell = \frac{1}{2}\big(1 - \prod_{j=1}^{w} (1 - \frac{\ell}{n-j})\big)$ respectively, where $\ell$ is
the number of ones in $u$ not counting the $i$-th bit. To see that this is indeed the resulting walk,
observe that if $s$ and $t$ differ on one of the input bits of the random function $h$, which happens with
probability $1 - \prod_{j=1}^{w} (1 - \frac{\ell}{n-j})$, then the values of the $i$-th coordinate of $s$ and of $t$ change
independently, each with probability one half. Otherwise they change simultaneously with probability $1/2$,
so the $i$-th bit of $u$ is unchanged.
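The claim about one step of this walk can be checked exactly on a small instance by enumerating every ordered choice of $w$ input coordinates and every function $h$ on $w$ bits. The sketch below is illustrative: the rows `s`, `t`, the coordinate `i`, and the width `w` are arbitrary, and the helper name `flip_distribution` is hypothetical. It verifies that $a_s$ is a fair bit, that $a_u$ equals one with half the probability that the chosen inputs of $s$ and $t$ differ, and that the two bits are independent.

```python
from fractions import Fraction
from itertools import permutations, product

def flip_distribution(s, t, i, w):
    """Exact joint law of (a_s, a_u) for one step at coordinate i.

    Averages over all ordered choices of w distinct input coordinates other
    than i and over all 2^(2^w) functions h on w bits.
    """
    n = len(s)
    others = [j for j in range(n) if j != i]
    inputs = list(product((0, 1), repeat=w))
    counts, total = {}, 0
    for coords in permutations(others, w):
        s_in = tuple(s[j] for j in coords)
        t_in = tuple(t[j] for j in coords)
        for table in product((0, 1), repeat=len(inputs)):
            h = dict(zip(inputs, table))
            a_s = h[s_in]
            a_u = h[s_in] ^ h[t_in]   # u = s xor t, so its flip is the xor of both flips
            counts[(a_s, a_u)] = counts.get((a_s, a_u), 0) + 1
            total += 1
    return {key: Fraction(c, total) for key, c in counts.items()}

s = (0, 1, 1, 0, 1)
t = (1, 1, 1, 0, 1)
i, w = 3, 2
n = len(s)
ell = sum(s[j] ^ t[j] for j in range(n) if j != i)   # ones of u outside coordinate i

law = flip_distribution(s, t, i, w)
p_s = law.get((1, 0), 0) + law.get((1, 1), 0)
p_u = law.get((0, 1), 0) + law.get((1, 1), 0)

# Probability that none of the w chosen inputs lands on a one of u.
miss = Fraction(1)
for j in range(1, w + 1):
    miss *= 1 - Fraction(ell, n - j)

assert p_s == Fraction(1, 2)
assert p_u == (1 - miss) / 2
assert law.get((1, 1), 0) == p_s * p_u   # independence of a_s and a_u
```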

The $u$-component of this walk is a variant of the Aldous cube, and by the comment at the end
of [3] it follows that this walk mixes in $O(n \log n)$ time. We are left to show that in this time the
walk on both components mixes. The way to see it is to notice that in $O(n \log n)$ time the event $A$
where the indices $i$ assume all possible values in $1, 2, \ldots, n$ (coupon collector) happens with high
probability. Now since the bits $a_s$ are independent of $a_u$, we get that even when we condition on
the walk on the $u$ component, the $s$ component achieves the uniform distribution conditioned on $A$,
which ends the proof. □
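The coupon-collector step can be made concrete with a short exact computation. This is a sketch under stated assumptions: coordinates are drawn independently and uniformly from $n$ values, the probability of the event $A$ is computed by inclusion-exclusion, and the constants `n = 16` and `c = 3` are arbitrary. The check uses the standard union bound $n(1 - 1/n)^T \le e^{-c}$ for $T \ge n \ln n + cn$.

```python
from fractions import Fraction
from math import ceil, comb, exp, log

def prob_all_seen(n, steps):
    """Exact probability that `steps` uniform draws from n coordinates hit all of them.

    Inclusion-exclusion over the set of coordinates that are never drawn.
    """
    return sum(
        (-1) ** j * comb(n, j) * Fraction(n - j, n) ** steps
        for j in range(n + 1)
    )

n, c = 16, 3
steps = ceil(n * log(n) + c * n)      # T = n ln n + c n draws, rounded up
p = prob_all_seen(n, steps)

# The failure probability is at most n (1 - 1/n)^T <= e^{-c}.
assert 1 - p <= exp(-c)
```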

Lemma 18. Let $A$ be an event such that $\Pr[A] \ge 1 - \varepsilon$, and let $Z$ be a random variable over a
domain $\Omega$ such that $d(Z|A, \mathrm{uniform}) \le \varepsilon$. Then $d(Z, \mathrm{uniform}) \le 2\varepsilon$.



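Proof.

\[
d(Z, \mathrm{uniform})
= \max_{S} \Big( \Pr[Z \in S] - \frac{|S|}{|\Omega|} \Big)
\le \max_{S} \Big( \Pr[Z \in S \mid A] + \Pr[\bar{A}] - \frac{|S|}{|\Omega|} \Big)
\le \varepsilon + d(Z|A, \mathrm{uniform})
\le 2\varepsilon.
\]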

□ 
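A small numerical instance of Lemma 18 may help fix the definitions. The sketch below assumes that $d(\cdot, \mathrm{uniform})$ is the maximum over subsets $S$ of $\Pr[Z \in S] - |S|/|\Omega|$, computed as the sum of the positive parts of the pointwise differences; the four-point domain and the two conditional laws are made up for illustration.

```python
from fractions import Fraction

def dist_to_uniform(law, domain):
    """max over subsets S of Pr[Z in S] - |S|/|domain| (sum of positive deviations)."""
    u = Fraction(1, len(domain))
    return sum(max(law.get(x, 0) - u, 0) for x in domain)

domain = range(4)
eps = Fraction(1, 10)
pr_A = 1 - eps                                   # Pr[A] = 1 - eps

# Hypothetical conditional laws of Z given A and given the complement of A.
law_given_A = {0: Fraction(3, 10), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 5)}
law_given_not_A = {0: Fraction(1)}

# Unconditional law of Z by the law of total probability.
law = {
    x: pr_A * law_given_A.get(x, 0) + (1 - pr_A) * law_given_not_A.get(x, 0)
    for x in domain
}

assert dist_to_uniform(law_given_A, domain) <= eps      # hypothesis of the lemma
assert dist_to_uniform(law, domain) <= 2 * eps          # conclusion of the lemma
```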

Lemma 19. Let $X$ be a random variable and $A$ an event. Then $\Pr[X \mid A] \le \Pr[X]/\Pr[A]$. (Follows
from the definition of conditional probability.)



7 Some concluding remarks 



Let us review what we currently know about the spectral gap of the Markov chain P = Py ' . By 
Theorem 2, $\mathrm{gap}(P) = \Omega(1/(n^2 k))$. On the other hand, $\mathrm{gap}(P)$ is nonincreasing in $k$ by the lifting
argument from Section 2. Since for $k = 1$, $P$ is the standard random walk on the cube, we have
that $\mathrm{gap}(P) = O(1/n)$.

In general, finding a generating set $S$ for which the spectral gap is large becomes more difficult as $k$
increases, until the largest conceivable $k$, which is $2^n - 2$. In this case, this is the random walk on
the Cayley graph of the alternating group $A_N$ for $N = 2^n$ with the generating set $S$. It is open
whether one can find a constant size set for which $A_N$ is an expander [16, Problem 10.3.4].$^6$ On
the other hand, by Alon and Roichman a random set of permutations of size $O(N \cdot \log N)$ will
almost surely have a constant spectral gap. Although smaller expanding sets for $A_N$ are not known
to exist, the general belief is that such sets exist; Rozenman, Shalev, and Wigderson assume the
existence of an $N^{1/30}$ expanding set for $A_N$ [18, Section 1.4].

Our results suggest that width 2 permutations may be used to construct an $O(\log^3 N)$ expanding
set for $A_N$. However, several obstacles stand in the way of achieving this goal. The first one is to
prove that for width 2 permutations the spectral gap does not deteriorate with $k$, as we believe,
and is $\Omega(1/n)$ for all $k$. The second problem is to achieve a constant gap. To this end, one has
to overcome the inherent and obvious weakness of the width 2 simple permutations, namely, that
their action depends only on two coordinates and changes only one. This leads to poor expansion
because there is only a small chance that the action will flip a specific bit or increase the distance
between two similar vectors. One approach to avoiding this problem is to replace the standard set
of generators of the cube $e_1, \ldots, e_n$ with some expanding set of size $O(n)$. Such an expanding set
for the cube can readily be constructed from the generating matrix of a good code [1], and could
then be used to define an $O(n^3)$ expanding set of permutations.

$^6$ The problems of finding a constant size expanding set for $A_N$ and for $S_N$ are equivalent.